25 research outputs found

    Statistical inference with anchored Bayesian mixture of regressions models: A case study analysis of allometric data

    Full text link
    We present a case study in which we use a mixture of regressions model to improve on an ill-fitting simple linear regression model relating log brain mass to log body mass for 100 placental mammalian species. The slope of this regression model is of particular scientific interest because it corresponds to a constant that governs a hypothesized allometric power law relating brain mass to body mass. A specific line of investigation is to determine whether the regression parameters vary across subgroups of related species. We model these data using an anchored Bayesian mixture of regressions model, which modifies the standard Bayesian Gaussian mixture by pre-assigning small subsets of observations to given mixture components with probability one. These observations (called anchor points) break the relabeling invariance typical of exchangeable model specifications (the so-called label-switching problem). A careful choice of which observations to pre-classify to which mixture components is key to the specification of a well-fitting anchor model. In the article we compare three strategies for the selection of anchor points. The first assumes that the underlying mixture of regressions model holds and assigns anchor points to different components to maximize the information about their labeling. The second makes no assumption about the relationship between x and y and instead identifies anchor points using a bivariate Gaussian mixture model. The third strategy begins with the assumption that there is only one mixture regression component and identifies anchor points that are representative of a clustering structure based on case-deletion importance sampling weights. We compare the performance of the three strategies on the allometric data set and use auxiliary taxonomic information about the species to evaluate the model-based classifications estimated from these models

    Bayesian Synthesis: Combining subjective analyses, with an application to ozone data

    Full text link
    Bayesian model averaging enables one to combine the disparate predictions of a number of models in a coherent fashion, leading to superior predictive performance. The improvement in performance arises from averaging models that make different predictions. In this work, we tap into perhaps the biggest driver of different predictions---different analysts---in order to gain the full benefits of model averaging. In a standard implementation of our method, several data analysts work independently on portions of a data set, eliciting separate models which are eventually updated and combined through a specific weighting method. We call this modeling procedure Bayesian Synthesis. The methodology helps to alleviate concerns about the sizable gap between the foundational underpinnings of the Bayesian paradigm and the practice of Bayesian statistics. In experimental work we show that human modeling has predictive performance superior to that of many automatic modeling techniques, including AIC, BIC, Smoothing Splines, CART, Bagged CART, Bayes CART, BMA and LARS, and only slightly inferior to that of BART. We also show that Bayesian Synthesis further improves predictive performance. Additionally, we examine the predictive performance of a simple average across analysts, which we dub Convex Synthesis, and find that it also produces an improvement.Comment: Published in at http://dx.doi.org/10.1214/10-AOAS444 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Rediscovering a little known fact about the t-test and the F-test: Algebraic, Geometric, Distributional and Graphical Considerations

    Full text link
    We discuss the role that the null hypothesis should play in the construction of a test statistic used to make a decision about that hypothesis. To construct the test statistic for a point null hypothesis about a binomial proportion, a common recommendation is to act as if the null hypothesis is true. We argue that, on the surface, the one-sample t-test of a point null hypothesis about a Gaussian population mean does not appear to follow the recommendation. We show how simple algebraic manipulations of the usual t-statistic lead to an equivalent test procedure consistent with the recommendation. We provide geometric intuition regarding this equivalence and we consider extensions to testing nested hypotheses in Gaussian linear models. We discuss an application to graphical residual diagnostics where the form of the test statistic makes a practical difference. By examining the formulation of the test statistic from multiple perspectives in this familiar example, we provide simple, concrete illustrations of some important issues that can guide the formulation of effective solutions to more complex statistical problems.Comment: 22 pages, 5 figure

    Case-deletion importance sampling estimators: Central limit theorems and related results

    Full text link
    Case-deleted analysis is a popular method for evaluating the influence of a subset of cases on inference. The use of Monte Carlo estimation strategies in complicated Bayesian settings leads naturally to the use of importance sampling techniques to assess the divergence between full-data and case-deleted posteriors and to provide estimates under the case-deleted posteriors. However, the dependability of the importance sampling estimators depends critically on the variability of the case-deleted weights. We provide theoretical results concerning the assessment of the dependability of case-deleted importance sampling estimators in several Bayesian models. In particular, these results allow us to establish whether or not the estimators satisfy a central limit theorem. Because the conditions we derive are of a simple analytical nature, the assessment of the dependability of the estimators can be verified routinely before estimation is performed. We illustrate the use of the results in several examples.Comment: Published in at http://dx.doi.org/10.1214/08-EJS259 the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Asymptotics of Lower Dimensional Zero-Density Regions

    Full text link
    Topological data analysis (TDA) allows us to explore the topological features of a dataset. Among topological features, lower dimensional ones have recently drawn the attention of practitioners in mathematics and statistics due to their potential to aid the discovery of low dimensional structure in a data set. However, lower dimensional features are usually challenging to detect from a probabilistic perspective. In this paper, lower dimensional topological features occurring as zero-density regions of density functions are introduced and thoroughly investigated. Specifically, we consider sequences of coverings for the support of a density function in which the coverings are comprised of balls with shrinking radii. We show that, when these coverings satisfy certain sufficient conditions as the sample size goes to infinity, we can detect lower dimensional, zero-density regions with increasingly higher probability while guarding against false detection. We supplement the theoretical developments with the discussion of simulated experiments that elucidate the behavior of the methodology for different choices of the tuning parameters that govern the construction of the covering sequences and characterize the asymptotic results.Comment: 28 page

    Vestibular Stimulation for ADHD: Randomized Controlled Trial of Comprehensive Motion Apparatus

    Get PDF
    Objective: This research evaluates effects of vestibular stimulation by Comprehensive Motion Apparatus (CMA) in ADHD. Method: Children ages 6 to 12 (48 boys, 5 girls) with ADHD were randomized to thrice-weekly 30-min treatments for 12 weeks with CMA, stimulating otoliths and semicircular canals, or a single-blind control of equal duration and intensity, each treatment followed by a 20-min typing tutorial. Results: In intent-to-treat analysis (n = 50), primary outcome improved significantly in both groups (p = .0001, d = 1.09 to 1.30), but treatment difference not significant (p = .7). Control children regressed by follow-up (difference p = .034, d = 0.65), but overall difference was not significant (p = .13, d = .47). No measure showed significant treatment differences at treatment end, but one did at follow-up. Children with IQ-achievement discrepancy ≥ 1 SD showed significantly more CMA advantage on three measures. Conclusion: This study illustrates the importance of a credible control condition of equal duration and intensity in trials of novel treatments. CMA treatment cannot be recommended for combined-type ADHD without learning disorder

    SARS-CoV-2 susceptibility and COVID-19 disease severity are associated with genetic variants affecting gene expression in a variety of tissues

    Get PDF
    Variability in SARS-CoV-2 susceptibility and COVID-19 disease severity between individuals is partly due to genetic factors. Here, we identify 4 genomic loci with suggestive associations for SARS-CoV-2 susceptibility and 19 for COVID-19 disease severity. Four of these 23 loci likely have an ethnicity-specific component. Genome-wide association study (GWAS) signals in 11 loci colocalize with expression quantitative trait loci (eQTLs) associated with the expression of 20 genes in 62 tissues/cell types (range: 1:43 tissues/gene), including lung, brain, heart, muscle, and skin as well as the digestive system and immune system. We perform genetic fine mapping to compute 99% credible SNP sets, which identify 10 GWAS loci that have eight or fewer SNPs in the credible set, including three loci with one single likely causal SNP. Our study suggests that the diverse symptoms and disease severity of COVID-19 observed between individuals is associated with variants across the genome, affecting gene expression levels in a wide variety of tissue types

    A first update on mapping the human genetic architecture of COVID-19

    Get PDF
    peer reviewe
    corecore